{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Static Badge](https://img.shields.io/badge/tulog-open_in_colab-blue?style=flat&logo=googlecolab&color=blue)](https://colab.research.google.com/drive/1oYTqKe7GGIgvB7VffH0xllg77zUBpd7S?usp=drive_link)\n",
    "\n",
    "### We recommend using the [Google Colab](https://colab.research.google.com/drive/1oYTqKe7GGIgvB7VffH0xllg77zUBpd7S?usp=drive_link) verion of the notebook!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ryw_yqg9JW9E"
   },
   "source": [
    "# Using Orion for Custom Data\n",
    "\n",
    "This notebook is quick tutorial to use Orion on custom data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5v9QLoXoJ_Sv"
   },
   "source": [
    "## Step 1: Load your data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "GgwONHlSI9oo",
    "outputId": "2f7cc935-3c2b-4a17-9df9-17ab3efcc7dc"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>timestamp</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1222819200</td>\n",
       "      <td>-0.366359</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1222840800</td>\n",
       "      <td>-0.394108</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1222862400</td>\n",
       "      <td>0.403625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1222884000</td>\n",
       "      <td>-0.362759</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1222905600</td>\n",
       "      <td>-0.370746</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    timestamp     value\n",
       "0  1222819200 -0.366359\n",
       "1  1222840800 -0.394108\n",
       "2  1222862400  0.403625\n",
       "3  1222884000 -0.362759\n",
       "4  1222905600 -0.370746"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# signal from orion library that you can replace with your own data\n",
    "\n",
    "from orion.data import load_signal\n",
    "\n",
    "data = load_signal('S-1')\n",
    "\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vNg-4g7LR_7N"
   },
   "source": [
    "If your data is not following the Orion standard, we need to format it such that it contains two columns:\n",
    "- **timestamp**: an integer representation of time.\n",
    "- **values**: the observed value of the time series at that specific time.\n",
    "\n",
    "Format the data if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "MFk7LSG9Rj6b",
    "outputId": "a64ec9fd-93b6-4810-dcc5-179a4351fd4e"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>timestamp</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1222819200</td>\n",
       "      <td>-0.366359</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1222840800</td>\n",
       "      <td>-0.394108</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1222862400</td>\n",
       "      <td>0.403625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1222884000</td>\n",
       "      <td>-0.362759</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1222905600</td>\n",
       "      <td>-0.370746</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    timestamp     value\n",
       "0  1222819200 -0.366359\n",
       "1  1222840800 -0.394108\n",
       "2  1222862400  0.403625\n",
       "3  1222884000 -0.362759\n",
       "4  1222905600 -0.370746"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "# convert the timestamp column into timestamps (integer values)\n",
    "\n",
    "timestamps = pd.to_datetime(data['timestamp'], unit='s')\n",
    "data['timestamp'] = timestamps.values.astype(np.int64) // 10 ** 9\n",
    "\n",
    "# rename columns in the dataframe to two important columns: timestamp, and value\n",
    "\n",
    "data = data.rename({\"timestamp\": \"timestamp\", \"value\": \"value\"})\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7LHEJkdPM6F7"
   },
   "source": [
    "## Step 2: Run Orion\n",
    "\n",
    "Use Orion to find anomalies in your time series signal.\n",
    "\n",
    "Orion provides a collection of anomaly detection pipelines which you can choose from. You can view the pipelines and their ranking in our [leaderbord](https://github.com/sintel-dev/Orion?tab=readme-ov-file#leaderboard).\n",
    "\n",
    "In this tutorial, we will use `AER` model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "EwmuT095Mu8t",
    "outputId": "5c8b2fac-aa77-47d9-bc09-9620a14a35be"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.\n",
      "WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.\n",
      "WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.\n",
      "WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.\n",
      "/opt/anaconda3/envs/orion/lib/python3.10/site-packages/sklearn/impute/_base.py:356: FutureWarning: The 'verbose' parameter was deprecated in version 1.1 and will be removed in 1.3. A warning will always be raised upon the removal of empty columns in the future version.\n",
      "  warnings.warn(\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/5\n",
      "126/126 [==============================] - 7s 32ms/step - loss: 0.1936 - tf.__operators__.getitem_loss: 0.1878 - tf.__operators__.getitem_1_loss: 0.1999 - tf.__operators__.getitem_2_loss: 0.1869 - val_loss: 0.2260 - val_tf.__operators__.getitem_loss: 0.2164 - val_tf.__operators__.getitem_1_loss: 0.2305 - val_tf.__operators__.getitem_2_loss: 0.2266\n",
      "Epoch 2/5\n",
      "126/126 [==============================] - 3s 27ms/step - loss: 0.1843 - tf.__operators__.getitem_loss: 0.1772 - tf.__operators__.getitem_1_loss: 0.1913 - tf.__operators__.getitem_2_loss: 0.1771 - val_loss: 0.2250 - val_tf.__operators__.getitem_loss: 0.2183 - val_tf.__operators__.getitem_1_loss: 0.2290 - val_tf.__operators__.getitem_2_loss: 0.2238\n",
      "Epoch 3/5\n",
      "126/126 [==============================] - 3s 28ms/step - loss: 0.1801 - tf.__operators__.getitem_loss: 0.1724 - tf.__operators__.getitem_1_loss: 0.1878 - tf.__operators__.getitem_2_loss: 0.1724 - val_loss: 0.2213 - val_tf.__operators__.getitem_loss: 0.2263 - val_tf.__operators__.getitem_1_loss: 0.2142 - val_tf.__operators__.getitem_2_loss: 0.2305\n",
      "Epoch 4/5\n",
      "126/126 [==============================] - 3s 28ms/step - loss: 0.1773 - tf.__operators__.getitem_loss: 0.1675 - tf.__operators__.getitem_1_loss: 0.1870 - tf.__operators__.getitem_2_loss: 0.1676 - val_loss: 0.2200 - val_tf.__operators__.getitem_loss: 0.2354 - val_tf.__operators__.getitem_1_loss: 0.2120 - val_tf.__operators__.getitem_2_loss: 0.2207\n",
      "Epoch 5/5\n",
      "126/126 [==============================] - 4s 28ms/step - loss: 0.1752 - tf.__operators__.getitem_loss: 0.1643 - tf.__operators__.getitem_1_loss: 0.1862 - tf.__operators__.getitem_2_loss: 0.1642 - val_loss: 0.2191 - val_tf.__operators__.getitem_loss: 0.2307 - val_tf.__operators__.getitem_1_loss: 0.2125 - val_tf.__operators__.getitem_2_loss: 0.2208\n"
     ]
    }
   ],
   "source": [
    "from orion import Orion\n",
    "\n",
    "hyperparameters = { # alter the hyperparameter settings here\n",
    "    'mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1': {\n",
    "        'interval': 21600\n",
    "    },\n",
    "    'orion.primitives.aer.AER#1': {\n",
    "        'epochs': 5,\n",
    "        'verbose': True\n",
    "    }\n",
    "}\n",
    "\n",
    "orion = Orion(\n",
    "    pipeline='aer',\n",
    "    hyperparameters=hyperparameters\n",
    ")\n",
    "\n",
    "orion.fit(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QlxHtAIcPmut"
   },
   "source": [
    "Now we can use the trained pipeline to find anomalies:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 115
    },
    "id": "l0XoaR5YPU6v",
    "outputId": "507a4873-1dd8-4d5c-973d-d2f47c813ecf"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "315/315 [==============================] - 1s 2ms/step\n",
      "315/315 [==============================] - 1s 3ms/step\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>start</th>\n",
       "      <th>end</th>\n",
       "      <th>severity</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1400716800</td>\n",
       "      <td>1405404000</td>\n",
       "      <td>0.117529</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        start         end  severity\n",
       "0  1400716800  1405404000  0.117529"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "orion.detect(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "6Fo50yWVPr2r"
   },
   "source": [
    "For any questions, please open an issue on\n",
    "[Orion github](https://github.com/sintel-dev/Orion)!"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
